When Bio-Inspired Computing meets Deep Learning: Low-Latency, Accurate, & Energy-Efficient Spiking Neural Networks from Artificial Neural Networks (2312.06900v1)

Published 12 Dec 2023 in cs.CV

Abstract: Bio-inspired Spiking Neural Networks (SNNs) now demonstrate accuracy comparable to intricate convolutional neural networks (CNNs), while delivering remarkable energy and latency efficiency when deployed on neuromorphic hardware. In particular, ANN-to-SNN conversion has recently gained significant traction for developing deep SNNs with close to state-of-the-art (SOTA) test accuracy on complex image recognition tasks. However, advanced ANN-to-SNN conversion approaches demonstrate that, for lossless conversion, the number of SNN time steps must equal the number of quantization steps in the ANN activation function; reducing the number of time steps significantly increases the conversion error. Moreover, the spiking activity of the SNN, which dominates the compute energy on neuromorphic chips, does not reduce proportionally with the number of time steps. To mitigate the accuracy concern, we propose a novel ANN-to-SNN conversion framework that incurs an exponentially lower number of time steps compared to that required by SOTA conversion approaches. Our framework modifies the SNN integrate-and-fire (IF) neuron model with identical complexity and shifts the bias term of each batch normalization (BN) layer in the trained ANN. To mitigate the spiking activity concern, we propose training the source ANN with a fine-grained L1 regularizer with surrogate gradients that encourages high spike sparsity in the converted SNN. Our proposed framework thus yields lossless SNNs with ultra-low latency and ultra-low compute energy, thanks to the ultra-low number of time steps and high spike sparsity, and ultra-high test accuracy, for example, 73.30% with only 4 time steps on the ImageNet dataset.
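To make the conversion constraint concrete, the sketch below (NumPy only) illustrates the standard rate-coding view behind ANN-to-SNN conversion: an integrate-and-fire (IF) neuron driven by a constant input for T time steps produces a spike count whose rate matches a T-level quantized ReLU, which is why lossless conversion conventionally ties the number of time steps to the number of quantization steps. The soft-reset rule, threshold value, and function names here are common conventions used for illustration, not the paper's specific modified neuron model.

```python
import numpy as np

def if_neuron_rate(inputs, threshold=1.0, T=4):
    """Simulate an IF neuron with soft reset (reset-by-subtraction) for T steps.

    inputs: constant per-step input current (NumPy array).
    Returns the rate-coded output, i.e. (spike count * threshold) / T.
    """
    v = np.zeros_like(inputs)       # membrane potential
    spikes = np.zeros_like(inputs)  # accumulated spike count
    for _ in range(T):
        v = v + inputs                              # integrate
        fired = (v >= threshold).astype(inputs.dtype)
        spikes += fired                             # fire
        v = v - fired * threshold                   # soft reset
    return spikes * threshold / T

def quantized_relu(x, threshold=1.0, T=4):
    """T-level quantized (clipped) ReLU that the IF neuron's rate approximates."""
    return np.clip(np.floor(x * T / threshold) * threshold / T, 0.0, threshold)

x = np.linspace(-0.5, 1.5, 9)
print(if_neuron_rate(x, T=4))   # matches the quantized ReLU below
print(quantized_relu(x, T=4))
```

With T = 4 the two outputs coincide, which is the lossless-conversion condition the abstract refers to; shrinking T below the ANN's quantization level is what introduces conversion error in the conventional setup.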

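The second ingredient, penalizing spike activity with an L1 term while backpropagating through the non-differentiable spike function via a surrogate gradient, can be sketched as follows in PyTorch. The boxcar surrogate shape, the penalty weight, and the stand-in task loss are illustrative assumptions; the paper's fine-grained regularizer and exact surrogate are not reproduced here.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Heaviside spike in the forward pass, boxcar surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, v, threshold):
        ctx.save_for_backward(v)
        ctx.threshold = threshold
        return (v >= threshold).float()

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # Pass gradient only for membrane potentials near the threshold (assumed width 0.5).
        surrogate = ((v - ctx.threshold).abs() < 0.5).float()
        return grad_out * surrogate, None

v = torch.randn(8, requires_grad=True)        # membrane potentials (toy example)
spikes = SpikeFn.apply(v, 1.0)

task_loss = spikes.sum()                       # stand-in for the real task loss
sparsity_loss = 1e-3 * spikes.abs().sum()      # L1 penalty on spiking activity
(task_loss + sparsity_loss).backward()
print(v.grad)                                  # nonzero only near the threshold
```

The L1 term directly discourages spikes, and the surrogate makes that pressure differentiable with respect to the pre-spike state, which is the mechanism by which such regularization can raise spike sparsity in the converted SNN.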
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, Y., Deng, L., Li, G., Zhu, J., Shi, L.: Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in Neuroscience 12 (2018) Zenke and Ganguli [2018] Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. 
Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. 
IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. 
Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. 
In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). 
https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. 
arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. 
[2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). 
https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. 
In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. 
arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. 
[2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Neftci, E.O., Mostafa, H., Zenke, F.: Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine 36(6), 51–63 (2019) O’Connor et al. [2018] O’Connor, P., Gavves, E., Reisser, M., Welling, M.: Temporally efficient deep learning with spikes. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=HkZy-bW0- Wu et al. [2018] Wu, Y., Deng, L., Li, G., Zhu, J., Shi, L.: Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in Neuroscience 12 (2018) Zenke and Ganguli [2018] Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. 
[2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. 
[2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 O’Connor, P., Gavves, E., Reisser, M., Welling, M.: Temporally efficient deep learning with spikes. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=HkZy-bW0- Wu et al. [2018] Wu, Y., Deng, L., Li, G., Zhu, J., Shi, L.: Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in Neuroscience 12 (2018) Zenke and Ganguli [2018] Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. 
[2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. 
arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, Y., Deng, L., Li, G., Zhu, J., Shi, L.: Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in Neuroscience 12 (2018) Zenke and Ganguli [2018] Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. 
[2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. 
arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. 
[2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 
1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. 
arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. 
[1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. 
[2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 
1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). 
https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) 
Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. 
[2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. 
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  3. Neftci, E.O., Mostafa, H., Zenke, F.: Surrogate gradient learning in spiking neural networks: Bringing the power of gradient-based optimization to spiking neural networks. IEEE Signal Processing Magazine 36(6), 51–63 (2019) O’Connor et al. [2018] O’Connor, P., Gavves, E., Reisser, M., Welling, M.: Temporally efficient deep learning with spikes. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=HkZy-bW0- Wu et al. [2018] Wu, Y., Deng, L., Li, G., Zhu, J., Shi, L.: Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in Neuroscience 12 (2018) Zenke and Ganguli [2018] Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. 
[2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. 
[2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 O’Connor, P., Gavves, E., Reisser, M., Welling, M.: Temporally efficient deep learning with spikes. In: International Conference on Learning Representations (2018). https://openreview.net/forum?id=HkZy-bW0- Wu et al. [2018] Wu, Y., Deng, L., Li, G., Zhu, J., Shi, L.: Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in Neuroscience 12 (2018) Zenke and Ganguli [2018] Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. 
arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. 
[2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, Y., Deng, L., Li, G., Zhu, J., Shi, L.: Spatio-temporal backpropagation for training high-performance spiking neural networks. Frontiers in Neuroscience 12 (2018) Zenke and Ganguli [2018] Zenke, F., Ganguli, S.: SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks. Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. 
arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. 
[2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. 
In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. 
[2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). 
In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. 
[2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. 
[2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. 
In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. 
[2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 
718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. 
[2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. 
[2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. 
Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. 
Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. 
[2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. 
[2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) 
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. 
arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. 
[2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 [cs.CV] (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic Gradient Descent Tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 [cs.CV] (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 [cs.CV] (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Neural Computation 30(6), 1514–1541 (2018) https://doi.org/10.1162/neco_a_01086 Meng et al. [2022] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. 
[2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. 
In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. 
[2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. 
[2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. 
In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. 
[2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. 
Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. 
[2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: CVPR (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: IJCAI-21, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021)
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022, Part XI, pp. 36–52. Springer (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: ICML, PMLR vol. 202, pp. 7645–7657 (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: NeurIPS (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: ICLR (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: ICLR (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: CVPR (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Hao et al. [2023] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ANN-SNN conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023). https://doi.org/10.1609/aaai.v37i1.25071
Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)
Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: ICLR (2022). https://openreview.net/forum?id=7B3IJMM1k_M
Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022)
Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018). https://doi.org/10.1109/MM.2018.112130359
Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016)
Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306
Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems (2022). https://doi.org/10.1109/TNNLS.2022.3195918
Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: IJCNN, pp. 1–8 (2015)
Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: ICML, pp. 6316–6325. PMLR (2021)
Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 56th ACM/IEEE Design Automation Conference (DAC), pp. 1–6 (2019)
Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: IJCAI-22, pp. 2485–2491 (2022). https://doi.org/10.24963/ijcai.2022/345
Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: IJCAI-22, pp. 2501–2508 (2022). https://doi.org/10.24963/ijcai.2022/347
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: IJCAI-23, pp. 3067–3075 (2023). https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: ICML, PMLR vol. 202, pp. 14945–14974 (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: CVPR, pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: NeurIPS (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: NeurIPS (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
[2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 
718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. 
[2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. 
[2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. 
Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). 
https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. 
[2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). 
https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. 
arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. 
[2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. 
In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. 
[2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. 
[2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. 
In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. 
arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
[2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/345
Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/347
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). Main Track. https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 
2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
  7. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Training high-performance low-latency spiking neural networks by differentiation on spike representation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 12444–12453 (2022) Xiao et al. [2022] Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. [2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 
1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. 
[2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Xiao, M., Meng, Q., Zhang, Z., He, D., Lin, Z.: Online training through time for spiking neural networks. In: Koyejo, S., Mohamed, S., Agarwal, A., Belgrave, D., Cho, K., Oh, A. (eds.) Advances in Neural Information Processing Systems, vol. 35, pp. 20717–20730. Curran Associates, Inc., ??? (2022). https://proceedings.neurips.cc/paper_files/paper/2022/file/82846e19e6d42ebfd4ace4361def29ae-Paper-Conference.pdf Sengupta et al. 
[2019] Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. 
[2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sengupta, A., et al.: Going deeper in spiking neural networks: VGG and residual architectures. Frontiers in Neuroscience 13, 95 (2019) Rueckauer et al. [2017] Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. 
In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. 
[2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rueckauer, B., et al.: Conversion of continuous-valued deep networks to efficient event-driven networks for image classification. Frontiers in Neuroscience 11, 682 (2017) Fang et al. [2021] Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. 
IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) 
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. 
In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. 
[2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. 
[2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. 
[2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. 
[2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. 
[2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. 
[2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). 
PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. 
[2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. 
In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. 
arXiv preprint arXiv:2302.14311 (2023)
Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016)
Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306
Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022). https://doi.org/10.1109/TNNLS.2022.3195918
Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), Main Track, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/345
Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), Main Track, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/347
Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23), Main Track, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). https://doi.org/10.24963/ijcai.2023/342
Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Yu, Z., Chen, Y., Masquelier, T., Huang, T., Tian, Y.: Incorporating learnable membrane time constant to enhance learning of spiking neural networks. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 2661–2671 (2021) Deng and Gu [2021] Deng, S., Gu, S.: Optimal conversion of conventional artificial neural networks to spiking neural networks. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=FZ1oTwcXchK Bu et al. [2022] Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022) https://doi.org/10.1609/aaai.v36i1.19874 Hao et al. [2023a] Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX Hao et al. [2023b] Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593
[2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 
718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. 
[2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. 
https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). 
https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. 
[2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. 
[2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. 
arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. 
Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. 
[2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023). https://doi.org/10.1609/aaai.v37i1.25071
Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)
Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M
Datta, G., et al.: ACE-SNN: Algorithm-hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022)
Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018). https://doi.org/10.1109/MM.2018.112130359
Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016)
Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306
Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022). https://doi.org/10.1109/TNNLS.2022.3195918
Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/345
Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/347
Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). https://doi.org/10.24963/ijcai.2023/342
Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. 
[2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 
1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. 
[2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Advances in Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ann-snn conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023) https://doi.org/10.1609/aaai.v37i1.25071 Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. 
[2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
[2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). 
PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx
[2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. 
In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. 
[2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly (2020). https://github.com/fangwei123456/spikingjelly
Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491 (2022). https://doi.org/10.24963/ijcai.2022/345
Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508 (2022). https://doi.org/10.24963/ijcai.2022/347
Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075 (2023). https://doi.org/10.24963/ijcai.2023/342
Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical report, University of Toronto, Toronto, Ontario (2009)
Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 
2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. 
Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
[2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
[2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
13. Bu, T., Ding, J., Yu, Z., Huang, T.: Optimized potential initialization for low-latency spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 36(1), 11–20 (2022). https://doi.org/10.1609/aaai.v36i1.19874
14. Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX
15. Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ANN-SNN conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023). https://doi.org/10.1609/aaai.v37i1.25071
16. Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)
17. Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M
18. Datta, G., et al.: ACE-SNN: Algorithm-hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022)
19. Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018). https://doi.org/10.1109/MM.2018.112130359
20. Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016)
21. Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306
22. Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
23. Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022). https://doi.org/10.1109/TNNLS.2022.3195918
24. Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
25. Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
26. Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
27. Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
28. Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
29. Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
30. Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
31. Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
32. Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/345
33. Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/347
34. Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). Main Track. https://doi.org/10.24963/ijcai.2023/342
35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075 (2023). https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). 
PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. 
[2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. 
In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. 
In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. 
[2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. 
arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic Gradient Descent Tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020). Accessed: YYYY-MM-DD
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Main Track, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/345
Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Main Track, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/347
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, Main Track, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning, PMLR 202, pp. 7645–7657 (2023). https://proceedings.mlr.press/v202/deng23d.html
You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
14. Hao, Z., Ding, J., Bu, T., Huang, T., Yu, Z.: Bridging the gap between ANNs and SNNs by calibrating offset spikes. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=PFbzoWZyZRX
15. Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ANN-SNN conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023). https://doi.org/10.1609/aaai.v37i1.25071
16. Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)
17. Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M
18. Datta, G., et al.: ACE-SNN: Algorithm-hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022)
19. Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018). https://doi.org/10.1109/MM.2018.112130359
20. Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016)
21. Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306
22. Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
23. Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022). https://doi.org/10.1109/TNNLS.2022.3195918
24. Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
25. Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
26. Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
27. Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
28. Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
29. Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
30. Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
31. Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
32. Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/345
33. Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/347
34. Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). https://doi.org/10.24963/ijcai.2023/342
35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
[2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. 
[2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. 
https://doi.org/10.24963/ijcai.2022/345
Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/347
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). Main Track. https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 [cs.NE] (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 [cs.CV] (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 [cs.CV] (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 [cs.CV] (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321. Main Track.
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  15. Hao, Z., Bu, T., Ding, J., Huang, T., Yu, Z.: Reducing ANN-SNN conversion error through residual membrane potential. Proceedings of the AAAI Conference on Artificial Intelligence 37(1), 11–21 (2023). https://doi.org/10.1609/aaai.v37i1.25071
Bengio et al. [2013] Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013)
Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M
Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022)
Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018). https://doi.org/10.1109/MM.2018.112130359
Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016)
Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306
Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022). https://doi.org/10.1109/TNNLS.2022.3195918
Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR
Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/345
Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/347
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). Main Track. https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks.
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bengio, Y., Léonard, N., Courville, A.: Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432 (2013) arXiv:1308.3432 [cs.LG] Bu et al. [2022] Bu, T., Fang, W., Ding, J., Dai, P., Yu, Z., Huang, T.: Optimal ANN-SNN conversion for high-accuracy and ultra-low-latency spiking neural networks. In: International Conference on Learning Representations (2022). https://openreview.net/forum?id=7B3IJMM1k_M Datta et al. [2022] Datta, G., et al.: ACE-SNN: Algorithm-Hardware co-design of energy-efficient & low-latency deep spiking neural networks for 3D image recognition. Frontiers in Neuroscience 16 (2022) Davies et al. [2018] Davies, M., et al.: Loihi: A neuromorphic manycore processor with on-chip learning. IEEE Micro 38(1), 82–99 (2018) https://doi.org/10.1109/MM.2018.112130359 Sengupta et al. [2016] Sengupta, A., et al.: Hybrid spintronic-CMOS spiking neural network with on-chip learning: Devices, circuits, and systems. Phys. Rev. Applied 6 (2016) Datta et al. [2021] Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306 Rathi et al. [2020] Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. 
718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. 
[2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. 
https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). 
https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. 
[2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/347
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). Main Track. https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/345
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. 
[2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). 
https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. 
[2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. 
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks.
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. 
https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. 
arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks.
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020) arXiv:2008.03658 [cs.NE] Wang et al. [2022] Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. 
[2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. 
[2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. 
[2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 
1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 
1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. 
[2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. 
arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 
2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
21. Datta, G., et al.: Training energy-efficient deep spiking neural networks with single-spike hybrid input encoding. In: IJCNN, vol. 1, pp. 1–8 (2021). https://doi.org/10.1109/IJCNN52387.2021.9534306
22. Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
23. Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022). https://doi.org/10.1109/TNNLS.2022.3195918
24. Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
25. Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
26. Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
27. Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
28. Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
29. Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
30. Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
31. Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
32. Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), pp. 2485–2491 (2022). https://doi.org/10.24963/ijcai.2022/345
33. Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), pp. 2501–2508 (2022). https://doi.org/10.24963/ijcai.2022/347
34. Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23), pp. 3067–3075 (2023). https://doi.org/10.24963/ijcai.2023/342
35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Proceedings of the 40th International Conference on Machine Learning, PMLR vol. 202, pp. 14945–14974 (2023). https://proceedings.mlr.press/v202/jiang23a.html
36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021)
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning, PMLR vol. 202, pp. 7645–7657 (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. 
[2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) 
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. 
[2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 
2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
22. Rathi, N., et al.: DIET-SNN: Direct input encoding with leakage and threshold optimization in deep spiking neural networks. arXiv preprint arXiv:2008.03658 (2020)
23. Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918
24. Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087
25. Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
26. Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
27. Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
28. Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
29. Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
30. Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
31. Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
32. Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Main Track, pp. 2485–2491 (2022). https://doi.org/10.24963/ijcai.2022/345
33. Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, Main Track, pp. 2501–2508 (2022). https://doi.org/10.24963/ijcai.2022/347
34. Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, Main Track, pp. 3067–3075 (2023). https://doi.org/10.24963/ijcai.2023/342
35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001
37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724
40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021)
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic Gradient Descent Tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020). Accessed: YYYY-MM-DD
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. 
https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. 
https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 
2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  23. Wang, Z., Gu, X., Goh, R.S.M., Zhou, J.T., Luo, T.: Efficient spiking neural networks with radix encoding. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–13 (2022) https://doi.org/10.1109/TNNLS.2022.3195918 Kim et al. [2018] Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 
202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. 
Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018) https://doi.org/10.1016/j.neucom.2018.05.087 Cao et al. [2015] Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. 
[2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
24. Kim, J., Kim, H., Huh, S., Lee, J., Choi, K.: Deep neural networks with weighted spikes. Neurocomputing 311, 373–386 (2018). https://doi.org/10.1016/j.neucom.2018.05.087
25. Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015)
26. Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015)
27. Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018)
28. Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020)
29. Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
30. Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ANN: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325. PMLR (2021)
31. Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
32. Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491 (2022). https://doi.org/10.24963/ijcai.2022/345
33. Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508 (2022). https://doi.org/10.24963/ijcai.2022/347
34. Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075 (2023). https://doi.org/10.24963/ijcai.2023/342
35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., et al. (eds.) Proceedings of the 40th International Conference on Machine Learning, PMLR vol. 202, pp. 14945–14974 (2023). https://proceedings.mlr.press/v202/jiang23a.html
36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021)
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., et al. (eds.) Proceedings of the 40th International Conference on Machine Learning, PMLR vol. 202, pp. 7645–7657 (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. 
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) 
Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. 
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  25. Cao, Y., et al.: Spiking deep convolutional neural networks for energy-efficient object recognition. IJCV 113, 54–66 (2015) Diehl et al. [2015] Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 
1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. 
[1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. 
In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019)
Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR
Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019)
Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/345
Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/347
Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). Main Track. https://doi.org/10.24963/ijcai.2023/342
Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic Gradient Descent Tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. 
arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. 
Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. 
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  26. Diehl, P.U., et al.: Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing. In: 2015 International Joint Conference on Neural Networks (IJCNN), vol. 1, pp. 1–8 (2015) Hu et al. [2018] Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Hu, Y., et al.: Spiking deep residual network. arXiv preprint arXiv:1805.01352 (2018) arXiv:1805.01352 [cs.NE] Rathi et al. [2020] Rathi, N., et al.: Enabling deep spiking neural networks with hybrid conversion and spike timing dependent backpropagation. arXiv preprint arXiv:2005.01807 (2020) arXiv:2005.01807 [cs.LG] Kim et al. [2019] Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks.
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. 
arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, S., Park, S., Na, B., Yoon, S.: Spiking-YOLO: Spiking neural network for energy-efficient object detection. arXiv preprint arXiv:1903.06530 (2019) arXiv:1903.06530 [cs.CV] Li et al. [2021] Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) 
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. 
In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. 
Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) 
Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. 
arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. 
Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. 
[2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Deng, S., Dong, X., Gong, R., Gu, S.: A free lunch from ann: Towards efficient, accurate spiking neural networks calibration. In: International Conference on Machine Learning, pp. 6316–6325 (2021). PMLR Park et al. [2019] Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. 
[2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. 
[2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). 
https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. 
Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. 
[1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. 
Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) 
Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). 
https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? 
(2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). 
https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. 
[2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). 
https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. 
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 
7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  31. Park, S., Kim, S., Choe, H., Yoon, S.: Fast and efficient information transmission with burst spikes in deep spiking neural networks. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), vol. 1, pp. 1–6 (2019) Li and Zeng [2022] Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/345 . Main Track. https://doi.org/10.24963/ijcai.2022/345 Wang et al. [2022] Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ann-snn conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization, ??? (2022). https://doi.org/10.24963/ijcai.2022/347 . Main Track. https://doi.org/10.24963/ijcai.2022/347 Wang et al. [2023] Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ann-snn conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization, ??? (2023). https://doi.org/10.24963/ijcai.2023/342 . Main Track. https://doi.org/10.24963/ijcai.2023/342 Jiang et al. [2023] Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR, ??? (2023). https://proceedings.mlr.press/v202/jiang23a.html Meng et al. [2022] Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022) https://doi.org/10.1016/j.neunet.2022.06.001 Lee et al. [2016] Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx
DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), Main Track, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/345
Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), Main Track, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). https://doi.org/10.24963/ijcai.2022/347
Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23), Main Track, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). https://doi.org/10.24963/ijcai.2023/342
Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
[2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) 
Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. 
arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. 
Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. 
arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. 
[2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. 
[2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
32. Li, Y., Zeng, Y.: Efficient and accurate conversion of spiking neural network with burst spikes. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2485–2491. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/345
33. Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Raedt, L.D. (ed.) Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI-22, pp. 2501–2508. International Joint Conferences on Artificial Intelligence Organization (2022). Main Track. https://doi.org/10.24963/ijcai.2022/347
34. Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Elkind, E. (ed.) Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI-23, pp. 3067–3075. International Joint Conferences on Artificial Intelligence Organization (2023). Main Track. https://doi.org/10.24963/ijcai.2023/342
35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 
2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  33. Wang, Y., Zhang, M., Chen, Y., Qu, H.: Signed neuron with memory: Towards simple, accurate and high-efficient ANN-SNN conversion. In: Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (IJCAI-22), Main Track, pp. 2501–2508 (2022). https://doi.org/10.24963/ijcai.2022/347
  34. Wang, B., Cao, J., Chen, J., Feng, S., Wang, Y.: A new ANN-SNN conversion method with high accuracy, low latency and good robustness. In: Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23), Main Track, pp. 3067–3075 (2023). https://doi.org/10.24963/ijcai.2023/342
  35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Proceedings of the 40th International Conference on Machine Learning, PMLR vol. 202, pp. 14945–14974 (2023). https://proceedings.mlr.press/v202/jiang23a.html
  36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
  37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
  38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
  39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems (2021). https://doi.org/10.1109/TNNLS.2021.3095724
  40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
  41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
  42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
  43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
  44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
  45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
  46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 35(12), 11062–11070 (2021)
  47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
  48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
  49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
  50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
  51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical report, University of Toronto, Toronto, Ontario (2009)
  52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
  53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
  54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
  55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
  56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
  57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
  58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439 (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
  59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
  60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
  61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning, PMLR vol. 202, pp. 7645–7657 (2023). https://proceedings.mlr.press/v202/deng23d.html
  62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
  63. Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
  64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
  65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
  66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
  67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
  68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
  69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
  70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
  71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
  72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020). Accessed: YYYY-MM-DD
  73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
  74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
  75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 
7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. 
[2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. 
arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. 
[2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. 
IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. 
(eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . 
https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902
Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
35. Jiang, H., Anumasa, S., De Masi, G., Xiong, H., Gu, B.: A unified optimization framework of ANN-SNN conversion: Towards optimal mapping from activation values to firing rates. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 14945–14974. PMLR (2023). https://proceedings.mlr.press/v202/jiang23a.html
36. Meng, Q., Yan, S., Xiao, M., Wang, Y., Lin, Z., Luo, Z.-Q.: Training much deeper spiking neural networks with a small number of time-steps. Neural Networks 153, 254–268 (2022). https://doi.org/10.1016/j.neunet.2022.06.001
37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016)
38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016)
39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021). https://doi.org/10.1109/TNNLS.2021.3095724
40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021). https://doi.org/10.1162/neco_a_01367
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020). Accessed: YYYY-MM-DD
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. 
(eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . 
https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). 
https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. 
Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 
7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. 
[2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. 
[2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. 
(eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . 
https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). 
https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. 
Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. 
AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. 
In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 
7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  37. Lee, J.H., et al.: Training deep spiking neural networks using backpropagation. Frontiers in Neuroscience 10 (2016) Panda and Roy [2016] Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). 
https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 
10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  38. Panda, P., Roy, K.: Unsupervised regenerative learning of hierarchical features in spiking deep networks for object recognition. arXiv preprint arXiv:1602.01510 (2016) arXiv:1602.01510 [cs.NE] Wu et al. [2021] Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. 
arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724 Zenke and Vogels [2021] Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zenke, F., Vogels, T.P.: The Remarkable Robustness of Surrogate Gradient Learning for Instilling Complex Function in Spiking Neural Networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367 https://direct.mit.edu/neco/article-pdf/33/4/899/1902294/neco_a_01367.pdf Meng et al. [2023] Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. 
In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023) 2302.14311 Guo et al. [2022] Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) 
Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . 
https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: Recdis-snn: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022) Guo et al. [2023] Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: Rmp-loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023) 2308.06787 Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. [2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). 
https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 
7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . 
https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
  39. Wu, J., Chua, Y., Zhang, M., Li, G., Li, H., Tan, K.C.: A tandem learning rule for effective training and rapid inference of deep spiking neural networks. IEEE Transactions on Neural Networks and Learning Systems 1(1), 1–15 (2021) https://doi.org/10.1109/TNNLS.2021.3095724
  40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367
  41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
  42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
  43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
  44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
  45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
  46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
  47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
  48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
  49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
  50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791
  51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
  52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
  53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
  54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
  55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
  56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
  57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
  58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
  59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
  60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
  61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
  62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
  63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
  64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
  65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
  66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593
  67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709
  68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
  69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
  70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with Cutout. arXiv preprint arXiv:1708.04552 (2017)
  71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
  72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
  73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
  74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
  75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. 
[2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 
7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. 
[2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3
Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
Bottou [2012] Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.), pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  40. Zenke, F., Vogels, T.P.: The remarkable robustness of surrogate gradient learning for instilling complex function in spiking neural networks. Neural Computation 33(4), 899–925 (2021) https://doi.org/10.1162/neco_a_01367
  41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
  42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
  43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
  44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
  45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
  46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. Proceedings of the AAAI Conference on Artificial Intelligence 35(12), 11062–11070 (2021)
  47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
  48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
  49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
  50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791
  51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
  52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
  53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
  54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
  55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
  56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI), pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
  57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
  58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021)
  59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
  60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
  61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
  62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Advances in Neural Information Processing Systems (2020)
  63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
  64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
  65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
  66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593
  67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709
  68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
  69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
  70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
  71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
  72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
  73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
  74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
  75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. 
[2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) 
Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. 
arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) 
Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). 
https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? 
(2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. 
In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. 
[2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. 
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
41. Meng, Q., Xiao, M., Yan, S., Wang, Y., Lin, Z., Luo, Z.-Q.: Towards memory- and time-efficient backpropagation for training spiking neural networks. arXiv preprint arXiv:2302.14311 (2023)
42. Guo, Y., Tong, X., Chen, Y., Zhang, L., Liu, X., Ma, Z., Huang, X.: RecDis-SNN: Rectifying membrane potential distribution for directly training spiking neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 326–335 (2022)
43. Guo, Y., Liu, X., Chen, Y., Zhang, L., Peng, W., Zhang, Y., Huang, X., Ma, Z.: RMP-Loss: Regularizing membrane potential distribution for spiking neural networks. arXiv preprint arXiv:2308.06787 (2023)
44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-Loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). Main Track. https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. 
Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b Duan et al. 
[2022] Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz Zheng et al. [2021] Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. 
In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . 
https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. 
arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. 
https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. 
In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021) Kim et al. [2020] Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. 
[2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020) arXiv:2010.01729 [cs.CV] Guo et al. [2023] Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) 
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. 
Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023) 2308.08359 Datta and Beerel [2022] Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704 Lecun et al. [1998] Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791 Krizhevsky et al. [2009] Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. 
[2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. 
Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. 
[2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. 
[2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. 
arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
  44. Guo, Y., Chen, Y., Zhang, L., Liu, X., Wang, Y., Huang, X., Ma, Z.: IM-loss: Information maximization loss for spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=Jw34v_84m2b
  45. Duan, C., Ding, J., Chen, S., Yu, Z., Huang, T.: Temporal effective batch normalization in spiking neural networks. In: Oh, A.H., Agarwal, A., Belgrave, D., Cho, K. (eds.) Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=fLIgyyQiJqz
  46. Zheng, H., et al.: Going deeper with directly-trained larger spiking neural networks. AAAI 35(12), 11062–11070 (2021)
  47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
  48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
  49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
  50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
  51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
  52. Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
  53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. 
Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. 
International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . 
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. 
In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. 
[2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. 
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
  47. Kim, Y., et al.: Revisiting batch normalization for training low-latency deep spiking neural networks from scratch. arXiv preprint arXiv:2010.01729 (2020)
  48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
  49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
  50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
  51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
  52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
  53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
  54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
  55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
  56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
  57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
  58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
  59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
  60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
  61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning, PMLR, vol. 202, pp. 7645–7657 (2023). https://proceedings.mlr.press/v202/deng23d.html
  62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
  63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
  64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
  65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
  66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
  67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
  68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
  69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
  70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
  71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
  72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
  73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
  74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
  75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) 
Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. 
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
48. Guo, Y., Zhang, Y., Chen, Y., Peng, W., Liu, X., Zhang, L., Huang, X., Ma, Z.: Membrane potential batch normalization for spiking neural networks. arXiv preprint arXiv:2308.08359 (2023)
49. Datta, G., Beerel, P.A.: Can deep neural networks be converted to ultra low-latency spiking neural networks? In: DATE, vol. 1, pp. 718–723 (2022). https://doi.org/10.23919/DATE54114.2022.9774704
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998). https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A large-scale hierarchical image database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009) Deng et al. [2009] Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. 
[2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). 
https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. 
[2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. 
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). 
https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. 
[2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). 
https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). 
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. 
In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
50. Lecun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998) https://doi.org/10.1109/5.726791
51. Krizhevsky, A., et al.: Learning multiple layers of features from tiny images. Technical Report 0, University of Toronto, Toronto, Ontario (2009)
52. Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009)
53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, Main Track, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
[2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. 
[2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. 
[2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. 
Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. 
arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. 
IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. 
[2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. 
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  52. Deng, J., et al.: ImageNet: A Large-Scale Hierarchical Image Database. In: CVPR (2009) Simonyan and Zisserman [2014] Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. [2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). 
https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014) He et al. [2016] He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016) Sandler et al. [2018] Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018) Ding et al. [2021] Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ann-snn conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21, pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization, ??? (2021). https://doi.org/10.24963/ijcai.2021/321 . Main Track. https://doi.org/10.24963/ijcai.2021/321 Wang et al. [2023] Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ann-snn conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023) 2205.07473 Li et al. 
[2021] Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc., ??? (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf Deng et al. [2022] Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv Guo et al. [2022] Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3 . https://doi.org/10.1007/978-3-031-20083-0_3 Deng et al. [2023] Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. 
  53. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
54. He, K., et al.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: MobileNetV2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence (IJCAI-21), Main Track, pp. 2328–2336 (2021). https://doi.org/10.24963/ijcai.2021/321
57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing's energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022)
74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022). https://doi.org/10.1109/TCAD.2022.3213211
75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  55. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C.: Mobilenetv2: Inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
  56. Ding, J., Yu, Z., Tian, Y., Huang, T.: Optimal ANN-SNN conversion for fast and accurate inference in deep spiking neural networks. In: Zhou, Z.-H. (ed.) Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI-21 (Main Track), pp. 2328–2336. International Joint Conferences on Artificial Intelligence Organization (2021). https://doi.org/10.24963/ijcai.2021/321
  57. Wang, Z., Lian, S., Zhang, Y., Cui, X., Yan, R., Tang, H.: Towards lossless ANN-SNN conversion under ultra-low latency with dual-phase optimization. arXiv preprint arXiv:2205.07473 (2023)
In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR, ??? (2023). https://proceedings.mlr.press/v202/deng23d.html You et al. [2020] You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. 
In: Thirty-fourth Conference on Neural Information Processing Systems (2020) Horowitz [2014] Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014) Gholami et al. [2021] Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. 
[2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021) 2103.13630 Sekikawa and Yashima [2023] Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). 
https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8 Davies et al. [2021] Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. 
[2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. 
In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. 
In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. 
IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
58. Li, Y., Guo, Y., Zhang, S., Deng, S., Hai, Y., Gu, S.: Differentiable spike: Rethinking gradient-descent for training spiking neural networks. In: Ranzato, M., Beygelzimer, A., Dauphin, Y., Liang, P.S., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34, pp. 23426–23439. Curran Associates, Inc. (2021). https://proceedings.neurips.cc/paper/2021/file/c4ca4238a0b923820dcc509a6f75849b-Paper.pdf
59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: ShiftAddNet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
63. Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021). https://doi.org/10.1109/JPROC.2021.3067593
67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020). https://doi.org/10.1109/JSSC.2020.2970709
68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly.
Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 
1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  59. Deng, S., et al.: Temporal efficient training of spiking neural network via gradient re-weighting. In: ICLR (2022). https://openreview.net/forum?id=_XNtisL32jv
  60. Guo, Y., Chen, Y., Zhang, L., Wang, Y., Liu, X., Tong, X., Ou, Y., Huang, X., Ma, Z.: Reducing information loss for spiking neural networks. In: Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XI, pp. 36–52. Springer, Berlin, Heidelberg (2022). https://doi.org/10.1007/978-3-031-20083-0_3
https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593 Deng et al. [2020] Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). 
https://doi.org/10.1109/ICASSP49357.2023.10094902 Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709 Bottou [2012] Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Bottou, L.: In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Stochastic Gradient Descent Tricks, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25 . https://doi.org/10.1007/978-3-642-35289-8_25 Loshchilov and Hutter [2017] Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. 
[2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx DeVries and Taylor [2017] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) arXiv:1708.04552 [cs.CV] Cubuk et al. [2019] Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: Autoaugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019) Fang et al. [2020] Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. 
[2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020) Datta et al. [2022] Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  61. Deng, S., Lin, H., Li, Y., Gu, S.: Surrogate module learning: Reduce the gradient error accumulation in training spiking neural networks. In: Krause, A., Brunskill, E., Cho, K., Engelhardt, B., Sabato, S., Scarlett, J. (eds.) Proceedings of the 40th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 202, pp. 7645–7657. PMLR (2023). https://proceedings.mlr.press/v202/deng23d.html
  62. You, H., Chen, X., Zhang, Y., Li, C., Li, S., Liu, Z., Wang, Z., Lin, Y.: Shiftaddnet: A hardware-inspired deep network. In: Thirty-fourth Conference on Neural Information Processing Systems (2020)
[2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) 2212.10170 [cs.CV] Yin et al. [2022] Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211 Datta et al. [2023] Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902 Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
  63. Horowitz, M.: Computing’s energy problem (and what we can do about it). In: ISSCC, pp. 10–14 (2014)
  64. Gholami, A., Kim, S., Dong, Z., Yao, Z., Mahoney, M.W., Keutzer, K.: A survey of quantization methods for efficient neural network inference. arXiv preprint arXiv:2103.13630 (2021)
  65. Sekikawa, Y., Yashima, S.: Bit-pruning: A sparse multiplication-less dot-product. In: The Eleventh International Conference on Learning Representations (2023). https://openreview.net/forum?id=YUDiZcZTI8
  66. Davies, M., Wild, A., Orchard, G., Sandamirskaya, Y., Guerra, G.A.F., Joshi, P., Plank, P., Risbud, S.R.: Advancing neuromorphic computing with Loihi: A survey of results and outlook. Proceedings of the IEEE 109(5), 911–934 (2021) https://doi.org/10.1109/JPROC.2021.3067593
  67. Deng, L., Wang, G., Li, G., Li, S., Liang, L., Zhu, M., Wu, Y., Yang, Z., Zou, Z., Pei, J., Wu, Z., Hu, X., Ding, Y., He, W., Xie, Y., Shi, L.: Tianjic: A unified and scalable chip bridging spike-based and continuous neural computation. IEEE Journal of Solid-State Circuits 55(8), 2228–2246 (2020) https://doi.org/10.1109/JSSC.2020.2970709
  68. Bottou, L.: Stochastic gradient descent tricks. In: Montavon, G., Orr, G.B., Müller, K.-R. (eds.) Neural Networks: Tricks of the Trade, pp. 421–436. Springer, Berlin, Heidelberg (2012). https://doi.org/10.1007/978-3-642-35289-8_25
  69. Loshchilov, I., Hutter, F.: SGDR: Stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (2017). https://openreview.net/forum?id=Skq89Scxx
  70. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017) [cs.CV]
  71. Cubuk, E.D., Zoph, B., Mane, D., Vasudevan, V., Le, Q.V.: AutoAugment: Learning augmentation strategies from data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
  72. Fang, W., Chen, Y., Ding, J., Yu, Z., Masquelier, T., Chen, D., Huang, L., Zhou, H., Li, G., Tian, Y., et al.: SpikingJelly. https://github.com/fangwei123456/spikingjelly. Accessed: YYYY-MM-DD (2020)
  73. Datta, G., et al.: Hoyer regularizer is all you need for ultra low-latency spiking neural networks. arXiv preprint arXiv:2212.10170 (2022) [cs.CV]
  74. Yin, R., et al.: SATA: Sparsity-aware training accelerator for spiking neural networks. IEEE TCAD (2022) https://doi.org/10.1109/TCAD.2022.3213211
  75. Datta, G., et al.: In-sensor & neuromorphic computing are all you need for energy efficient computer vision. In: ICASSP, pp. 1–5 (2023). https://doi.org/10.1109/ICASSP49357.2023.10094902
Authors (5)
  1. Gourav Datta (34 papers)
  2. Zeyu Liu (54 papers)
  3. James Diffenderfer (24 papers)
  4. Bhavya Kailkhura (108 papers)
  5. Peter A. Beerel (66 papers)